gradient descent について

Words near each other

・ "O" Is for Outlaw
・ "O"-Jung.Ban.Hap.
・ "Ode-to-Napoleon" hexachord
・ "Oh Yeah!" Live
・ "Our Contemporary" regional art exhibition (Leningrad, 1975)
・ "P" Is for Peril
・ "Pimpernel" Smith
・ "Polish death camp" controversy
・ "Pro knigi" ("About books")
・ "Prosopa" Greek Television Awards
・ "Pussy Cats" Starring the Walkmen
・ "Q" Is for Quarry
・ "R" Is for Ricochet
・ "R" The King (2016 film)
・ "Rags" Ragland
・ ! (album)
・ ! (disambiguation)
・ !!
・ !!!
・ !!! (album)
・ !!Destroy-Oh-Boy!!
・ !Action Pact!
・ !Arriba! La Pachanga
・ !Hero
・ !Hero (album)
・ !Kung language
・ !Oka Tokat
・ !PAUS3
・ !T.O.O.H.!
・ !Women Art Revolution

Dictionary Lists

mini英和辞書

翻訳と辞書　辞書検索 [ 開発暫定版 ]

スポンサードリンク

gradient descent ：ウィキペディア英語版

gradient descent

Gradient descent is a first-order optimization algorithm. To find a local minimum of a function using gradient descent, one takes steps proportional to the ''negative'' of the gradient (or of the approximate gradient) of the function at the current point. If instead one takes steps proportional to the ''positive'' of the gradient, one approaches a local maximum of that function; the procedure is then known as gradient ascent.
Gradient descent is also known as steepest descent, or the method of steepest descent. However, gradient descent should not be confused with the method of steepest descent for approximating integrals.
==Description==

Gradient descent is based on the observation that if the multivariable function

F(\mathbf)

is defined and differentiable in a neighborhood of a point

\mathbf

, then

F(\mathbf)

decreases ''fastest'' if one goes from

\mathbf

in the direction of the negative gradient of

F

\mathbf

-\nabla F(\mathbf)

. It follows that, if
:

\mathbf = \mathbf-\gamma\nabla F(\mathbf)

for

\gamma

small enough, then

F(\mathbf)\geq F(\mathbf)

. With this observation in mind, one starts with a guess

\mathbf_0

for a local minimum of

F

, and considers the sequence

\mathbf_0, \mathbf_1, \mathbf_2, \dots

such that
:

\mathbf_=\mathbf_n-\gamma_n \nabla F(\mathbf_n),\ n \ge 0.

We have
:

F(\mathbf_0)\ge F(\mathbf_1)\ge F(\mathbf_2)\ge \cdots,

so hopefully the sequence

(\mathbf_n)

converges to the desired local minimum. Note that the value of the ''step size''

\gamma

is allowed to change at every iteration. With certain assumptions on the function

F

(for example,

F

convex and

\nabla F

Lipschitz) and particular choices of

\gamma

(e.g., chosen via a line search that satisfies the Wolfe conditions), convergence to a local minimum can be guaranteed. When the function

F

is convex, all local minima are also global minima, so in this case gradient descent can converge to the global solution.
This process is illustrated in the picture to the right. Here

F

is assumed to be defined on the plane, and that its graph has a bowl shape. The blue curves are the contour lines, that is, the regions on which the value of

F

is constant. A red arrow originating at a point shows the direction of the negative gradient at that point. Note that the (negative) gradient at a point is orthogonal to the contour line going through that point. We see that gradient ''descent'' leads us to the bottom of the bowl, that is, to the point where the value of the function

F

is minimal.

抄文引用元・出典: フリー百科事典『ウィキペディア（Wikipedia）』
■ウィキペディアで「gradient descent」の詳細全文を読む

スポンサードリンク

翻訳と辞書 : 翻訳のためのインターネットリソース